Universal Dependencies for Sanskrit
Author
Abstract
We present the first steps towards a treebank of Sanskrit within the Universal Dependencies framework. Our dataset is tiny at the moment, consisting of fewer than 200 sentences, the result of a summer internship project. Nevertheless, to the best of our knowledge, this is the first publicly available piece of syntactically annotated Sanskrit text. We also present a parsing experiment whose results surpass those of delexicalized parsing.
Related resources
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks were developed for Persian, either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
Converting Phrase Structures to Dependency Structures in Sanskrit
Two annotation schemes for presenting parsed structures are prevalent, viz. the constituency structure and the dependency structure. While constituency trees mark the relations due to positions, dependency relations mark the semantic dependencies. Free word order languages like Sanskrit pose more problems for constituency parses since the elements within a phrase are dislocated. In ...
Universal Decompositional Semantics on Universal Dependencies
We present a framework for augmenting data sets from the Universal Dependencies project with Universal Decompositional Semantics. Where the Universal Dependencies project aims to provide a syntactic annotation standard that can be used consistently across many languages as well as a collection of corpora that use that standard, our extension has similar aims for semantic annotation. We describe...
How Formal Concept Lattices Solve a Problem of Ancient Linguistics
In his grammar of ancient Sanskrit, Pāṇini represents the phonological classes as intervals of a list. This representation method and especially the actual list constructed by Pāṇini, which is called the Śivasūtras, earns universal admiration. The legend says that god Śiva revealed the Śivasūtras to Pāṇini in order to let him start developing his grammar of Sanskrit. A question still disc...
Linguistic Typology meets Universal Dependencies
Current work on universal dependency schemes in NLP does not make reference to the extensive typological research on language universals, but could benefit since many principles are shared between the two enterprises. We propose a revision of the syntactic dependencies in the Universal Dependencies scheme (Nivre et al. [16, 17]) based on four principles derived from contemporary typological the...
Publication date: 2017